Nonlinear Inverse Reinforcement Learning with Gaussian Processes

Authors

  • Sergey Levine
  • Zoran Popovic
  • Vladlen Koltun
Abstract

We present a probabilistic algorithm for nonlinear inverse reinforcement learning. The goal of inverse reinforcement learning is to learn the reward function in a Markov decision process from expert demonstrations. While most prior inverse reinforcement learning algorithms represent the reward as a linear combination of a set of features, we use Gaussian processes to learn the reward as a nonlinear function, while also determining the relevance of each feature to the expert’s policy. Our probabilistic algorithm allows complex behaviors to be captured from suboptimal stochastic demonstrations, while automatically balancing the simplicity of the learned reward structure against its consistency with the observed actions.
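
The feature-relevance mechanism the abstract alludes to is automatic relevance determination (ARD): a separate length-scale per feature in the GP kernel. The sketch below illustrates only that mechanism on fabricated (feature, reward) data using scikit-learn, optimizing the ordinary GP marginal likelihood rather than the demonstration likelihood that the paper's method maximizes; all data and hyperparameters are placeholders.

```python
# Minimal sketch: ARD kernel length-scales as per-feature relevance.
# Placeholder data stands in for (state-feature, reward) pairs; GPIRL itself
# learns the reward values from demonstrations, which is not shown here.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 4))        # 4 candidate state features
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2   # reward depends on features 0 and 1 only
y += 0.05 * rng.standard_normal(len(y))          # noisy "observations" of the reward

# Anisotropic RBF kernel: one length-scale per feature (ARD).
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X.shape[1]))
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-3, normalize_y=True).fit(X, y)

# A large learned length-scale means the GP ignores that feature,
# so the inverse length-scale serves as a crude relevance score.
length_scales = gp.kernel_.k2.length_scale
print("per-feature relevance:", 1.0 / length_scales)
```

Irrelevant features are pushed toward large length-scales during hyperparameter optimization, which is the sense in which such a GP "determines the relevance of each feature".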

Related articles

Inverse Optimal Control

In Reinforcement Learning, an agent learns a policy that maximizes a given reward function. However, providing a reward function for a given learning task is often non-trivial. Inverse Reinforcement Learning, sometimes also called Inverse Optimal Control, addresses this problem by learning the reward function from expert demonstrations. The aim of this paper is to give a brief introduc...

Inverse Reinforcement Learning via Deep Gaussian Process

We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations. Our model stacks multiple latent GP layers to learn abstract representations of the state feature space, which is linked to the demonstrations through the Maximum Entropy learning framework. Inco...
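
For context on the "Maximum Entropy learning framework" that links a reward to the demonstrations, the sketch below runs soft value iteration on a small fabricated MDP: given a candidate reward, it produces the stochastic policy under which demonstration likelihood would be evaluated. This is only the generic MaxEnt forward pass, not the deep GP model or its training procedure.

```python
# Soft value iteration: the forward computation underlying MaxEnt IRL.
# Given a reward, it yields a stochastic policy pi(a|s) proportional to exp(Q(s,a)),
# under which demonstration likelihoods can be evaluated.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions, gamma = 6, 3, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
r = rng.standard_normal(n_states)                                  # candidate reward r(s)

V = np.zeros(n_states)
for _ in range(200):
    Q = r[:, None] + gamma * P @ V          # Q(s,a) = r(s) + gamma * E[V(s')]
    V_new = np.log(np.exp(Q).sum(axis=1))   # soft maximum over actions
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = np.exp(Q - V[:, None])             # pi(a|s) = exp(Q(s,a) - V(s))
print(policy.sum(axis=1))                   # each row sums to 1
```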

Kernel Least-Squares Temporal Difference Learning

Kernel methods have attracted considerable research interest recently, since by utilizing Mercer kernels, nonlinear and nonparametric versions of conventional supervised or unsupervised learning algorithms can be implemented, usually with better generalization ability. However, kernel methods in reinforcement learning have not been widely studied in the literature. In this paper, w...
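
As a rough sketch of the kernel least-squares TD idea (one common formulation, not necessarily the paper's), the value function is represented as V(s) = Σ_i α_i k(s, s_i) over visited states, and α is obtained from a regularized linear system built from kernel matrices of current and successor states. The transitions below are synthetic placeholders.

```python
# Kernel LSTD sketch: value function V(s) = sum_i alpha_i k(s, s_i).
# Solves (K (K - gamma*K') + lam*I) alpha = K r for a batch of transitions (s, r, s').
import numpy as np

def rbf(A, B, bw=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bw ** 2))

rng = np.random.default_rng(2)
S  = rng.uniform(-1, 1, size=(100, 2))        # states s_t (placeholder trajectory)
Sp = S + 0.1 * rng.standard_normal(S.shape)   # next states s_{t+1}
r  = -np.linalg.norm(S, axis=1)               # placeholder rewards
gamma, lam = 0.95, 1e-3

K  = rbf(S, S)                                # K_ij  = k(s_i, s_j)
Kp = rbf(Sp, S)                               # K'_ij = k(s'_i, s_j)
alpha = np.linalg.solve(K @ (K - gamma * Kp) + lam * np.eye(len(S)), K @ r)

V = lambda s: rbf(np.atleast_2d(s), S) @ alpha   # value estimate at a new state
print(V(np.array([0.0, 0.0])))
```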

Unsupervised Learning for Nonlinear PieceWise Smooth Hybrid Systems

This paper introduces a novel system identification and tracking method for PieceWise Smooth (PWS) nonlinear stochastic hybrid systems. We are able to correctly identify and track challenging problems with diverse dynamics and low dimensional transitions. We exploit the composite structure of the system to learn a simpler model on each component/mode. We use Gaussian Process Regression techniques to l...
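
A loose illustration of fitting a simpler model per component/mode: assuming the mode labels were already known (the paper recovers them unsupervised, which is not attempted here), one could train an independent GP regressor on each mode's data. Everything below, including the switching surface, is made up for the example.

```python
# Sketch: one GP regressor per mode of a piecewise-smooth system,
# assuming mode labels are given (the hard, unsupervised part is omitted).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)
x = rng.uniform(-2.0, 2.0, size=(300, 1))
mode = (x[:, 0] > 0).astype(int)                 # toy switching surface at x = 0
y = np.where(mode == 0, np.sin(2 * x[:, 0]), 0.5 * x[:, 0] ** 2)
y += 0.05 * rng.standard_normal(len(y))

# Fit a simpler local model on each component/mode.
models = {
    m: GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-3).fit(x[mode == m], y[mode == m])
    for m in np.unique(mode)
}

def predict(x_new, m):
    """Predict with the GP of the given mode."""
    return models[m].predict(np.atleast_2d(x_new))

print(predict([-1.0], 0), predict([1.0], 1))
```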

On the Numeric Stability of Gaussian Processes Regression for Relational Reinforcement Learning

In this work we investigate the behavior of Gaussian processes as a regression technique for reinforcement learning. When confronted with too many mutually dependent learning examples, the matrix inversion needed for prediction of a new target value becomes numerically unstable. By using suitable numerical techniques and employing QR factorization, these instabilities can be ...
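
To make the stability point concrete, the sketch below contrasts explicitly inverting the (possibly ill-conditioned) kernel matrix with a QR-factorization-based solve for the GP prediction weights; the data and kernel are placeholders, and a jittered Cholesky factorization is another common remedy.

```python
# GP prediction weights alpha = (K + sigma^2 I)^{-1} y:
# explicit inversion vs. a QR-factorization-based solve.
import numpy as np
from scipy.linalg import qr, solve_triangular

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(150, 2))
y = np.sin(X[:, 0]) + 0.01 * rng.standard_normal(len(X))

d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2)                                # RBF kernel matrix
A = K + 1e-6 * np.eye(len(X))                  # tiny noise term; A can be badly conditioned

alpha_naive = np.linalg.inv(A) @ y             # explicit inverse: numerically fragile

Q, R = qr(A)                                   # QR factorization of A
alpha_qr = solve_triangular(R, Q.T @ y)        # back-substitution instead of inversion

print(np.max(np.abs(alpha_naive - alpha_qr)))  # discrepancy grows with ill-conditioning
```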


Journal:

Volume   Issue

Pages  -

Publication date: 2011